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Abstract 

Background: Polyploidy is an important evolutionary mechanism in flowering plants that often induces immediate 
extensive changes in gene expression through genomic merging and doubling. Brassica napus L. is one of the 
most economically important polyploid oil crops and has been broadly studied as an example of polyploid crop. 
RNA-seq is a recently developed technique for transcriptome study, which could be in choice for profiling gene 
expression pattern in polyploids. 

Results: We examined the global gene expression patterns of the first four generations of resynthesized B. napus 
(F-|-F 4 ), its diploid progenitors B. rapa and B. oleracea, and natural B. napus using digital gene expression analysis. 
Almost 42 million clean tags were generated using lllumina technology to produce the expression data for 25959 
genes, which account for 63% of the annotated B. rapa genome. More than 56% of the genes were transcribed 
from both strands, which indicate the importance of RNA-mediated gene regulation in polyploidization. Tag 
mapping of the B. rapa genome generated 19023, 18547, 24383, 20659, 18881, 20692, and 19955 annotated genes 
for the B. rapa, B. oleracea, F q -F 4 of synthesized B. napus, and natural B. napus libraries, respectively. The 
unambiguous tag-mapped genes in the libraries were functionally categorized via gene ontological analysis. 
Thousands of differentially expressed genes (DEGs) were identified and revealed the substantial changes in F n — F 4 . 
Among the 20 most DEGs are DNA binding/transcription factor, cyclin-dependent protein kinase, epoxycarotenoid 
dioxygenase, and glycine-rich protein. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the DEGs 
suggested approximately 120 biological pathways. 

Conclusions: The systematic deep sequencing analysis provided a comprehensive understanding of the 
transcriptome complexity of early generations of synthesized B. napus. This information broadens our 
understanding of the mechanisms of B. napus polyploidization and contributes to molecular and genetic research 
by enriching the Brassica database. 



Background 

Polyploidization is an ancient and ongoing evolutionary 
process that promotes plant evolution by reshaping 
genomes [1,2]. The majority of flowering plants have 
undergone polyploidization (complete or partial) and 
chromosome rearrangement [3]. Generally, polyploids 
are divided into allopolyploids and autopolyploids [4,5]. 
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Many major crops, including wheat, cotton, oat, coffee, 
and oilseed rape, are fundamentally allopolyploids [6-8]. 
Recently, many studies have revealed that methylation sta- 
tus, gene expression profile, chromosomal rearrangements, 
deletions, insertions, and sequence substitutions in many 
allopolyploids differ to their progenitor [9,10]. Birchler and 
Veitia [11] proposed the gene balance theory with regard 
to quantitative traits and gene duplication following poly- 
ploidy [11]. Angiosperm genome plasticity in polyploids is 
always related to changes in gene expression, which are 
largely controlled by epigenetic profiles [12,13]. Gaeta and 
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Pires [14] concluded that these genomic changes occur in 
both natural and resynthesized polyploids [14]. Molecular 
analyses of natural and resynthesized allopolyploids indi- 
cate that genetic and epigenetic changes are common 
results of polyploidization in different species [5,15,16]. 
Discovering the structural and functional evolution of gen- 
omes during polyploidization is of great importance in 
plant biology [17]. 

Cultivated Brassica species include important econom- 
ical crops mostly closely related to Arabidopsis thaliana 
and provide great chances for studying the effects of poly- 
ploidization [3]. The availability of the Arabidopsis genome 
effectively facilitates studies on Brassica polyploidization 
[18]. Soon after the Arabidopsis and Brassica lineage 
diverged -17.0 million years ago (Mya), the triplicated 
Brassica subgenomes divergence was estimated to be 
14.3 Mya [17,19]. B. rapa (A genome) and B. oleracea 
(C genome) descended from a common hexaploid an- 
cestor of A. thaliana. Moreover, the lineage of B. rapa 
and B. oleracea diverged around 3.5 Mya [20], and a recent 
segmental duplication in B. rapa occurred -0.8 Mya [21]. 
Studies on A- and C-genome mapping, genome compari- 
son, and genome evolution have been performed during 
the past decades. Additional genome duplication aside 
from triplication, as well as complex chromosomal changes 
was revealed in B. oleracea [22,23]. B. rapa, -529 Mb per 
haploid, was first launched for complete genome sequen- 
cing. The allopolyploid B. napus (AC, n = 19) was spontan- 
eously derived from hybridization of A and C progenitors 
[24]. The genome of natural B. napus was confirmed intact 
without rearrangement, but resynthesized B. napus under- 
went rapid changes in the early generations, including 
genetic changes and methylation changes [25-28]. Non- 
additive proteins and additive proteins were completely 
identified in resynthesized B. napus, indicating the early 
steps of allopolyploidization repatterning are controlled by 
nonstochastic mechanisms [10,29]. Xiong et al. [30] indi- 
cated that aneuploidy, gross chromosomal rearrangements, 
and dosage balance maintain the genomic stability of 
synthesized B. napus [30] . 

Considering microarrays enable the comparison of gene 
expression at the transcriptome level, a set of Brassica 
unigenes assembled using Brassica expressed sequence 
tags (ESTs) was developed and assigned to discriminate 
paralogous genes, but not homologous genes, between the 
A- and the C-genome [6,31]. Recently emerged next gen- 
eration sequencing (NGS) technology is an alternative 
method for better genome, epigenome, and transcriptome 
study [32,33]. Many plant species have benefited from 
this technology, including B. rapa and B. napus. The draft 
genome sequence of B. rapa accession Chiifu-401-42 
was newly released by Illumina GA II technology and 
annotated [34], which provides an important resource 
for studying the evolution of polyploid genomes. Leaf 



transcriptome of B. napus had been dissected by se- 
quencing [35,36]. 

In the present research, we conducted a digital gene 
expression (DGE) analysis on resynthesized B. napus 
across the F!-F 4 generations to address the transcriptome 
changes after polyploidization, which were also compared 
with their genetic progenitors {B. rapa and B. oleracea) 
and natural B. napus. Previous studies on proteomic 
changes in this population revealed that differentially 
expressed proteins in F 2 differed from the progenitors, 
exhibiting non-additive repatterning. Furthermore, gene 
silencing during polyploidization induces differences in 
protein expression in different generations of synthesized 
B. napus [37]. We report the expression profile of genes in 
resynthesized B. napus, show the upregulation of essential 
pathways and genes in Fx, and compare them with pro- 
genitors, which are downregulated in the F 2 -F 4 genera- 
tions. This is the first comprehensive transcriptomic 
research that identifies DEGs and the pathways involved 
in first four generations of synthesized B. napus after 
polyploidization. 

Results 

Digital gene expression (DGE) profile 

To investigate global transcriptome during the polyploidi- 
zation of B. napus, we performed a DGE analysis on the 
seedling stage of resynthesized B. napus (Fx-FJ and its 
diploid parents, as well as natural B. napus. Finally, DGE 
libraries from the leaves of four-week-old plants were gen- 
erated and sequenced by Illumina technology. The se- 
quence data are available from the GEO repository, 
accession number GSE43246. The statistics of the DGE 
tags is shown in Table 1. Approximately six million raw 
tags were generated for each library, and more than 97% 
of the raw tags were clean tags. A total of 6178564 raw 
tags were obtained from B. rapa (Br), 6059222 from B. 
oleracea (Bo), 6155227 from B. napus-F 1 (Bn-F^, 6092805 
from B. napus-F 2 (Bn-F 2 ), 6142098 from B. napus-F 3 
(Bn-F 3 ), 5938583 from B. napus-Y 4 (Bn-F 4 ), and 5964594 
from natural B. napus (Bn-N). After removing the low- 
quality sequences and adapter sequences, 6018254, 
5930726, 6022170, 5950123, 5991210, 5798939, and 
5823113 clean tags were obtained with 21 nt in length 
in the corresponding species. Unambiguous tags were 
counted and normalized to the number of transcripts per 
million tags (TPM) to evaluate the gene expression level. 
The results show that the mRNA transcribed from major 
genes had fewer than ten copies and only a small propor- 
tion of genes were highly expressed. The distribution of 
clean tags in the seven libraries was determined to evalu- 
ate the normality of the dataset. The results show a con- 
sistent pattern, with most of the tags coming from highly 
expressed genes. In this analysis, the total number of clean 
tags is the sum of all clean tags and the number of distinct 
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Table 1 Statistics of categorization and abundance of DGE tags 



Summary 




B. rapa 


B. oleracea 


B. napus-F-i 


B. napus-F 2 


B. napus-F 3 


B. napus-F 4 


B. napus-H 


Raw Data 


Total 


6178564 


6059222 


6155227 


6092805 


6142098 


5938583 


5964594 


Raw Data 


Distinct Tag 


293575 


243895 


331406 


275597 


284202 


272986 


269285 


Clean tag 


Total number 


6018254 


5930726 


6022170 


5950123 


5991210 


5798939 


5823113 


Clean tag 


Distinct Tag 
number 


1 33499 


1 16771 


I yy4zo 


1 34403 


1 34721 


1 34857 


1 28967 


Tag Mapping to Gene 


Total number 


1 r\f a r\r\r\ 

1 964909 


1 747843 


3025405 


2354059 


1652733 


2295897 


2253347 


Tag Mapping to Gene 


Distinct Tag 
number 


44267 


36220 


88270 


49781 


42613 


51122 


45358 


Unambiguous Tag 
Mapping to Gene 


Total number 


1 679848 


1475050 


249541 1 


2016182 


1415736 


1971591 


1 924944 


Unambiguous Tag 
Mapping to Gene 


Total % of clean 
tag 


27.91% 


24.87% 


41.44% 


33.88% 


23.63% 


34.00% 


33.06% 


Unambiguous Tag 
Mapping to Gene 


Distinct Tag 
number 


39414 


31933 


79648 


44456 


37996 


45762 


40561 


Unambiguous Tag 
Mapping to Gene 


Distinct Tag % of 
clean tag 


29.52% 


27.35% 


39.94% 


33.08% 


28.20% 


33.93% 


31.45% 


Tag-mapped Genes 


number 


19023 


18547 


24383 


20659 


18881 


20692 


19955 


Tag-mapped Genes 


% of ref genes 


46.20% 


45.05% 


59.22% 


50.17% 


45.86% 


50.26% 


48.47% 


Unambiguous 
Tag-mapped Genes 


number 


16574 


15970 


22059 


18155 


16479 


18196 


17448 


Unambiguous 
Tag-mapped Genes 


% of ref genes 


40.25% 


38.79% 


53.58% 


44.09% 


40.02% 


a a 1 on/ 

44. 1 9% 


42.38% 


Mapping to Genome 


Total number 


2437918 


2105332 


1 776978 


2290544 


2658991 


2232572 


2164464 


Mapping to Genome 


Total % of clean 
tag 


40.5 1 % 


35.50% 


29.51% 


38.50% 


A A o on/ 
44.00% 


38.50% 


37.1 7% 


Mapping to Genome 


Distinct Tag 
number 


44076 


30703 


49898 


42334 


45680 


42590 


40689 


Mapping to Genome 


Distinct i ag to ot 
clean tag 


33 mo/, 
oo.UzTO 


Zo.Zyyo 


~>c: mo/, 

ZO.UZTO 


3 1 C.P\0L 

o I .0U70 


3 3 Q 1 0/rs 

OO.y I 70 


3 1 <^Q0/„ 
0 I .0070 


3 1 CCOL 
0 I .0070 


Unknown Tag 


Total number 


1615427 


2077551 


1219787 


1305520 


1679486 


1270470 


1405302 


Unknown Tag 


Total % of clean 
tag 


26.84% 


35.03% 


20.25% 


21.94% 


28.03% 


21.91% 


24.13% 


Unknown Tag 


Distinct Tag 
number 


45156 


49848 


61260 


42288 


46428 


41145 


42920 


Unknown Tag 


Distinct Tag % of 
clean tag 


33.82% 


42.69% 


30.72% 


31.46% 


34.46% 


30.51% 


33.28% 



Clean tags are tags after filtering dirty tags from raw data, 
removing tags mapped to reference genome. 



Distinct tags are different kinds of tags and unambiguous tags are the reminder clean tags after 



clean tags is the number of different clean tags (Figure 1, 
Additional file 1: SI and Additional file 2: S2). We found 
that the percentage of distinct tags with high counts 
dropped dramatically, and the distinct tags with more than 
100 copies accounted less than 10%. However, more than 
70% of total clean tags have accounts above 100 in each li- 
brary. By contrast, 50% of the distinct clean tags had copy 
numbers between two and five, but they only represented 
around 4% of the total number of clean tags, which indi- 
cates only a small number of mRNA were expressed at 
high abundance and the majority were expressed at very 
low levels [38]. The clean tags were then mapped onto the 
B. rapa genome [34] and the numbers of tags that could be 



mapped onto genes with a maximum of one base pair mis- 
match in Br, Bo, Bn-Fi, Bn-F 2 , Bn-F 3 , Bn-F 4> and Bn-N were 
1964909, 1747843, 3025405, 2354059, 1652733, 2295897, 
and 2253347, respectively (Table 1). Statistical analysis of 
clean tag alignment was conducted, including analysis of 
total clean tags and distinct clean tags (Additional file 3: S3, 
Additional file 2: S2). We found that around 70% of total 
clean tags were mapped onto the B. rapa genome with per- 
fect match or with 1 bp mismatch to sense or anti-sense 
genes, and approximately 65% of the distinct clean tags 
were successfully mapped. Finally, the tag mapping onto 
the B. rapa genome generated 19023 tag-mapped genes for 
Br, 18547 for Bo, 24383 for Bn-F x , 20659 for Bn-F 2 , 18881 
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Figure 1 Distribution of total tag and distinct tag counts over different tag abundance categories from the seven libraries. 



for Bn-F 3 , 20692 for Bn-F 4 , and 19955 for Bn-N. In total, 
25959 genes were identified from the seven libraries, which 
accounted for approximately 63% of genes in the annotated 
B. rapa genome (Additional file 4: S4). 

To determine whether the sequencing depth was suffi- 
cient for the transcriptome coverage, a saturation analysis 
was performed to check whether the number of detected 
genes increased with sequencing amount until the number 
of reads reached 2 million (Additional file 5: S5). We also 
analyzed the distribution of the ratio of distinct tag 
copy numbers in each pair of libraries and found that 
more than 90% of the distinct tags had ratios up to 
5-fold (Additional file 6: S6). Antisense genes play an 
important role in gene expression and regulation, and 
about 36% (14775) of the genes were matched by dis- 
tinct tags in antisense orientations. Up to 5726 genes 
were mapped by DGE tags in both sense and antisense 
orientations in the Br library, 5123 in the Bo library, 
11903 in the Bn-Fi library, 6568 Bn-F 2 , 5692 in the Bn-F 3 
library, 6887 in the Bn-F 4 library, and 6253 in the Bn-N 
library (Additional file 4: S4). In total, more than 56% of 
the genes (14775 of 25959 genes) were transcribed from 
both strands, which indicate the importance of RNA- 
mediated gene regulation in polyploidization. 

Differentially expressed genes in early generations of 
resynthesized B. napus 

To identify genes that were differentially expressed in 
the early generations of synthesized B. napus after poly- 
ploidization, we compared pairs of DGE profiles of the 
seven libraries (Br versus Bn-Fi, Bo versus Bn-Fi, Bn-F x 
versus Bn-F 2 , Bn-F! versus Bn-F 3 , Bn-F! versus Bn-F 4 , 
and Bn-N versus Bn-F 1> where A was the control and B 
was experimental group in A versus B') to analyze gene 
expression variations (Figure 2 and Additional file 7: S7). 
The comparison of B. napus-Y x with B. rapa revealed 
that 4197 DEGs were significantly upregulated and 360 



DEGs were downregulated in B. napus-Y x in comparison 
with B. rapa. By contrast, 4554 DEGs were downregu- 
lated and 886 DEGs were upregulated in B. napus-Y x 
compared with B. oleracea. The number of upregulated 
DEGs in B. napus-¥ x was more than that downregulated 
after polyploidization, which might indicate heterosis. 
Apart from the DEGs among resynthesized B. napus-¥ x 
and its progenitors, the number of the identified DEGs 
differed in the early generations of synthesized B. napus, 
and most of the DEGs showed downregulation in the 
F2-F4 generations compared with the F 2 generation. Up 
to 3022 DEGs were downregulated and 507 were upre- 
gulated in B. napus-F 2 compared with B. napus-¥^ 4718 
DEGs were downregulated and 502 were upregulated in 
B. napus-F 3 compared with B. napus-Y^ 2882 DEGs 
were downregulated and 545 DEGs were upregulated in 
B. napus-Y^ and B. napus-¥ v Comparison of resynthe- 
sized and natural B. napus showed 649 tags were downre- 
gulated and 2916 tags were upregulated in B. napus-Y x in 
comparison with natural B. napus. By comparison, the 
lowest number of DEGs was found in the F 4 generation, 
which was in a relatively stable stage after polyploidiza- 
tion. Figure 3 shows the distribution of genes commonly 
expressed in B. rapa, B. oleracea, B. napus-Y^ and natural 
B. napus, which indicates that a number of conserved 
genes were identified apart from the DEGs. Distribution 
of genes commonly and specifically expressed in Bn-F^, 
F 2 , F 3 and F 4 was also analyzed (Figure 4). Among the 
DEGs, the 20 most abundantly expressed genes that were 
upregulated or downregulated during polyploidization are 
listed in Additional file 8: S8. We found that the genes 
that encode DNA binding/transcription factor (Bra039065, 
Bra018796) were more prominent, which were downre- 
gulated in ¥ 1 compared with its progenitors and then upre- 
gulated in the F 2 -F 3 generations compared with the F 2 
generation. Genes encoding cyclin-dependent protein kin- 
ase (Bra016640), epoxycarotenoid dioxygenase (Bra020970), 
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Br vs. Bn-Fl Bo vs. Bn-Fl Bn-Fl vs. Bn-F2 Bn-Fl vs. Bn-F3 Bn-Fl vs. Bn-F4 Bn-N vs. Bn-Fl 



Figure 2 Numbers of differentially expressed genes in each comparison. The numbers of up-regulated (in red) and down-regulated genes 

(in green) are presented. 'A' was the control and 'B' was experimental group in 'A vs. B'. 
k J 



and glycine-rich protein (Bra040817) were upregulated in 
Fi during polyploidization, and then downregulated in 
F 2 -F 4 compared with F^ 

Functional annotation of DEGs 

Annotation of the sequences using GO database yielded 
good results for 22059 unambiguously mapped genes iden- 
tified using the Bn-Fi DGE tags. These well-annotated 
sequences belonged to three main categories (cellular com- 
ponent, molecular function, and biological process) and 
distributed into 33 categories, including the most domin- 
ant pathways such as 'binding and catalytic activity,' cell 
and cell part,' organelle,' cellular and metabolic processes,' 
and response to stimulus' (Figure 5). A similar GO distri- 
bution was revealed in the other six libraries (Additional 
file 9: S9). However, we did not find any genes in the trans- 
porter cluster in Bo. Genes were not found in the extra- 
cellular region cluster in Bn-F 2 . Enzyme regulators and 
translation regulators were not observed in Bn-F 3 . 

To identify the biological pathways active in the DGE 
libraries, we mapped all the annotated genes to terms in 
the KEGG database to search for significantly enriched 
genes involved in metabolic or signal transduction path- 
ways (Additional file 10: S10). Among all the genes with 
KEGG annotation, DEGs were identified in Bn-F! in 
comparison with Br. In total, we assigned 2911 DEGs to 
121 KEGG pathways. Up to 38 of these pathways were 
significantly enriched, with Q values < 0.05 (red border re- 
gion), including metabolic pathways, ribosome, and car- 
bon fixation in photosynthetic organisms. Similar pathway 
enrichment was revealed in pair comparison of each li- 
braries (Bo vs. Bn-Fi, Bn-Fi vs. Bn-F 2 , Bn-Fx vs. Bn-F 3 , 
Bn-Fi vs. Bn-F 4 , and Bn-N vs. Bn-F 1> where A was the 
control and B was experimental group in A vs. B') . The 
3339 DEGs identified in Bn-Fi in comparison with Bo 
were assigned to 122 KEGG pathways, 44 of which were 
significantly enriched. The 2293 DEGs identified in Bn-F 2 
in comparison with Bn-Fi were assigned to 122 KEGG 



pathways, 3322 DEGs identified in Bn-F 3 in comparison 
with Bn-Fi were assigned to 120 pathways, 2229 DEGs 
identified in Bn-F 4 in comparison with Bn-Fi were 
assigned to 119 pathways, and 2283 DEGs identified in 
Bn-Fi in comparison with Bn-N were assigned to 119 
pathways. 

Genome-wide non-additive gene regulation in 
synthesized Bn-F! 

We compared the transcript abundance in synthesized 
Bn-Fi with the relative mid-parent value (MPV: an equal 
mixture of RNA from two parents) to identify genes that 
showed differential expression pattern in Bn-F!. Up to 
19785 genes in Bn-Fi showed differences in expression 
from MPV, 12012 (60.7%) of them showed higher expres- 
sion levels than MPV, and 7773 (39.3%) of these genes 
showed lower expressions than MPV. Among these non- 
additively expressed genes, 9456 (47.8%) genes were 
expressed at higher levels in Br than in Bo, and 10329 



( ^ 

Bn-N Bo 




Figure 3 Distribution of the genes commonly and specifically 
expressed in natural B. napus, synthesized B. napus-F^ and its 
progenitors {B. rapa and B. oleracea). 

V ) 



Jiang et al. BMC Genomics 2013, 14:72 
http://www.biomedcentral.eom/1 471 -21 64/1 4/72 



Page 6 of 1 1 



Bn-F2 Bn-F3 




Figure 4 Distribution of the genes commonly and specifically 
expressed in synthesized B. napus-F^, B. napus-F 2 , B> napus-F 3 
and B. napus-F 4 _ 

\ J 



(52.2%) genes were expressed at lower levels in Br than Bo 
(Table 2). The data suggest that the orthologous genes 
in polyploids are frequently expressed in a non-additive 
pattern. 

Cluster analysis of DEGs 

Cluster analysis of DEGs in the early generations of 
synthesized B. napus was performed with the correlated 
expression profile (Additional file 4: S4). Generally, 7948 
DEGs in all the comparisons were clustered as the union 
of DEGs. Up to 1358 DEGs in all the comparisons were 
clustered as the DEG intersections (Figure 6). Among 
the 12 major clusters, the upregulated transcripts were 
enriched in clusters A-K in the comparisons of Bn-N vs. 
Bn-F!, Br vs. Bn-F^ and Bo vs. Bn-F!. The downregu- 
lated transcripts were enriched in clusters A-K in the 



comparisons of Bn-F! vs. Bn-F 2 , Bn-F! vs. Bn-F 3 , and 
Bn-Fi vs. Bn-F 4 . Only one exception indicated that clus- 
ter L showed gene upregulation in Bn-F 2 , Bn-F 3 and Bn- 
F 4 compared with Bn-Fi, and downregulation in Bn-Fi 
compared with Bn-N, Br and Bo. We also identified the 
enrichment of DEGs in each cluster in different func- 
tional categories and found that the genes involved in 
the metabolism category were enriched significantly in 
all clusters except cluster H. Those in the protein with 
binding function or cofactor requirement category were 
enriched significantly in clusters A-H, whereas those in 
the subcellular localization category were enriched in all 
clusters except for cluster D. All the clustered DEGs are 
listed in Additional file 11: Sll. Multiple genes repre- 
sented for proteins with binding function, such as zinc 
ion binding (Bra019896), ATP binding (Bra008602, 
Bra023518), glutathione binding (Bra022815), selenium 
binding (Bra032756), and FAD/NADP/NADPH binding 
(Bra001931), showed different expression values in dif- 
ferent generations of resynthesized B. napus and its pro- 
genitors. Some genes that encode ribosomal proteins 
were also changed during polyploidization. 

Discussion 

Differences in gene expression between synthesized 
B. napus-F^ and its progenitors 

Based on analysis of the leaf transcriptome data for 
synthesized B. napus-¥ x and its progenitors, as well as 
the natural B. napus generated in this study, we found 
that the majority of the B. napus-¥ x transcripts were 
conserved with the parental sequences. Up to 19023, 
18547, and 24383 sequences were mapped by B. napus- 
Fi and its progenitors B. rapa and B. oleracea. 3182 
genes with same TPM were expressed in Bn-Fi and Br, 
and 2421 genes with same TPM were expressed in Bn-Fi 




Cellular Component Molecular Function Biological Process 



Figure 5 Histogram presentation of gene ontology classification of synthesized B. napus-?^. 22059 unambiguously mapped genes were 
assigned to three main categories: cellular component, biological process and molecular function. The right- and left-axis indicates the number of 
genes in a category and the percentage of the specific category of genes in the main category, respectively. 

) 
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Table 2 Number of non-additively expressed genes in resynthesized Bn-F t 





a % 


b 


% 


b/a (%) 


c 


% 


c/a (%) 




No. of non-additively expressed genes 


No. of non-additively expressed genes 


No. of non-additively expressed genes 




Hybrid versus MPV 




Hybrid > MPV 






Hybrid < MPV 




Bn-F] 


19785 


12012 




60.7 


7773 




39.3 


Br > Bo 


9456 47.8 


6134 


51.1 


64.9 


3322 


42.7 


35.1 


Br < Bo 


10329 52.2 


5878 


48.9 


56.9 


4451 


57.3 


43.1 



and Bo, which indicate that these genes might be signifi- 
cantly inherited by Bn-Fi from one of its progenitors. 
The genes with different TPM expressed in B. napus-Y\ 
and its progenitors, which indicate that the genome 
combination results duplicated genes in the resynthe- 
sized polyploid. However, only 14 of these commonly 
expressed genes showed TPM values in Bn-F! equal to 
the combination of its progenitors. Furthermore, most 
of the commonly expressed genes in the progenitors 
were non-additively expressed in Bn-F!, which might be 
responsible for gene repression and activation [5,28,39]. 
The majority of the non-additively expressed genes in 
Bn-Fi displayed expression values different to its pro- 
genitors, indicating that the transcriptome was recon- 
ciled during polyploidization [40]. Non-additive gene 
regulation should be involved in various biological path- 
ways, which may lead to subfunctionalization of duplicated 
genes [41]. The gene expression in the resynthesized Bn-Fi 
was more complicated than the simple combination of two 
genomes, which might be due to the homologous re- 
combination between closely related genomes (A- and 
C-genome) [42]. Xiong et al. [30] reported the variable 
chromosome instability of synthesized B. napus, and 
chromosome loss was compensated by the gain of 
homologous chromosomes [30]. Szadkowski et al. [43] 
reported the genome blender of synthesized B. napus 
during first meiosis, which resulted in the rearrange- 
ment of two genomes and the restructuring in further 
generations [43]. Thus, resynthesized polyploids derived 
from closely related species result in changes in the 
non-additive pattern of multiple gene expression, which 
explains the molecular bases for hybrid vigor [44] . 

Besides the commonly expressed genes in polyploids 
and diploid progenitors, there are some species specific 
genes. As shown in Figure 3, 487, 800, 3325 and 549 genes 
were specifically expressed in Br, Bo, Bn-Fi and Bn-N, re- 
spectively. The number of Bn-Fi specific genes was much 
higher than its progenitors, indicating many new tran- 
scripts emerged during polyploidization. This was consist- 
ent with Xu et al. [45] that 20 novel transcripts were 
detected in leaves of resynthesized Bn-Fi [45]. Based on 
the clustering analysis of DEGs, we found 1333 out of the 
1358 interspecific DEGs were up-regulated in Bn-Fi com- 
pared with Br and Bo. These up-regulated genes express 
proteins with different biological function, including salt 



tolerance protein, mitochondria/chloroplast membrance 
binding proteins, kinds of protein kinases, etc. Confirm- 
ation of these expression differences in further study is 
necessary for deep into the genetic causes of interest char- 
acteristics and improved qualities. Whereas 18 out of the 
1358 interspecific DEGs were down- regulated in Bn-Fi 
compared with Br and Bo. Only two genes (Bra005226 
and Bra026817) were up-regulated in Bn-Fi compared 
with Br, whereas down-regulated in Bn-Fi compared 
with Bo. Five genes (Bra040268, Bra038457, Bra033201, 
Bra037320 and Bra039516) were up-regulated in Bn-Fi 
compared with Bo, whereas down-regulated in Bn-Fi 
compared with Br (Figure 6). All these gene expression 
differences are of great interest for studying the genome 
polyploidization of Brassica species. 

Gene expression differences among early generations of 
synthesized B. napus (FtF^ and natural B. napus 

Hitherto, studies on polyploidy mechanism were conducted 
on natural and resynthesized B. napus using proteomic, 
transcriptome, and cytogenetic analyses [6,10,29,31,39,43]. 
However, few studies that trace genomic changes (includ- 
ing cytosine methylation, DNA fragment loss, genetic re- 
arrangement) and transcriptome changes in different 
generations of synthesized B. napus have been reported 
[9,27,28]. As mentioned above, many novel transcripts 
emerged in Bn-Fi compared its progenitors. The number 
of Bn-Fi specific genes was also higher than Bn-Fi, Bn-F 2 , 
Bn-F 3 (Figure 4). Many of these new transcripts emerged 
in the first generation of resynthesized B. napus disap- 
peared after successive self hybridizations, indicating the 
instability of resynthesized polyploids. Based on the clus- 
tering of DEGs found in the comparisons of Br vs. Bn-Fi, 
Bo vs. Bn-Fi, Bn-Fi vs. Bn-F 2 , Bn-Fi vs. Bn-F 3 , Bn-Fi vs. 
Bn-F 4 , and Bn-N vs. Bn-Fi, we easily found that most 
DEGs are significantly downregulated in Bn-F 2 , Bn-F 3 , 
and Bn-F 4 compared with Bn-Fi, but these DEGs were 
upregulated in Bn-Fi compared with its progenitors and 
natural B. napus. This phenomenon also indicated the 
genomic instability after polyploidization, and the discov- 
ery of multiple gene expression differences may compen- 
sate for the limitation of proteomic analysis [37]. The 
upregulation of DEGs in synthesized Bn-Fi during poly- 
ploidization could explain its improved characteristics 
compared with its progenitors. For example, the expression 
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Figure 6 Hierarchical cluster analysis of differentially expressed transcripts between each comparisons (Bn-F! vs. Bn-F 3/ Bn-F! vs. Bn-F 2/ 
Bn-F-| vs. Bn-F 4/ Bn-N vs. Bn-F lf Br vs. Bn-F l7 Bo vs. Bn-F-,. 'A' was the control and 'B' was experimental group in 'A vs. B'). The letters from A 
to L indicates the twelve major clusters resulted from HCE analysis. Pie charts represent functional classification of DEGs in each cluster based on 
MIPS functional catalogue. 
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pattern of Bra022585, which encodes stress-responsive 
protein and Bra040633, which encodes defense-related 
protein, were changed during polyploidization. Some of 
these gene expression differences might be related to 
the DNA methylation or gene fragment losses during 
the combination of two genomes [45]. For example, 
Bra003009, which encodes O-methyltransferase 1, and 
Bra006323, which encodes an N2-dimethylguanosine 
tRNA methyltransferase family protein, were upregu- 
lated in Bn-Fi during polyploidization and downregu- 
lated in the F 2 -F 4 generations. In the pathway analysis, 
we found Bra003009 participated in the flavone and fla- 
vonol biosynthesis, which functioned in the transform- 
ation of kaempferide into kaempferol. This may indicate 
that kaempferol level was different in the synthesized B. 
napus. The changes in expression include many transcrip- 
tion factors, binding proteins, and ribosomal proteins, 
must be important for the merging of the A- and C- 
genomes in synthesized polyploids. For the 18 DEGs, 
which were down-regulated in Bn-F! compared with Br 
and Bo, we found they were up-regulated in Bn-F 2 , Bn-F 3 
and Bn-F 4 compared with Bn-F!. This indicated that tran- 
scriptome recovery of genome balance functioned after 
polyploidization. Further research is needed to verify and 
determine the causes of these transcriptome changes and 
how they correlate with the phenotypic divergence and 
fertility of synthesized polyploids. 

Conclusions 

By applying the DGE deep sequencing, this study inves- 
tigated the transcriptome profile of resynthesized B. 
napus and its diploid progenitors, as well as natural B. 
napus, with an aim to illustrate the gene expression dif- 
ferences during polyploidization. The amounts of tran- 
scripts obtained provided a foundation for future 
research on polyploidy mechanism of B. napus. Globally 
identified DEGs and putative biological pathways 
revealed the gene expression difference between synthe- 
sized B. napus and progenitors, and differential expres- 
sion of genes across generations of synthesized B. napus. 
Generally, the gene expression in the resynthesized B. 
napus was more complicated than the simple combin- 
ation of two genomes, and non-additive gene regulation 
was also detected in synthesized B. napus. These find- 
ings provided a contribution to the existing sequence 
resources for Brassica and would certainly facilitate 
polyploidy research of this genus. 

Methods 

Plant material 

Seeds of resynthesized B. napus (Fi generation) and 
its successive selfing generations (F 2 -F 4 ), its diploid 
progenitors B. rapa (cv. Aikangqing) and B. oleracea 
(cv. Zhonghua Jielan), were generously provided by 



the Oil Crops Research Institute of the Chinese Academy 
of Agricultural Sciences. Natural B. napus cv. Yang6 was 
provided by the Jiangsu Institute of Agricultural Science 
in the Lixiahe District (China). All the plants were culti- 
vated in climate chambers at 25°C, a 16 h light: 8 h dark 
photoperiod, and 70% relative humidity. The first true 
leaves from three plants of each genotype were pooled at 
the same physiologic stage (28-day-old seedlings) and fro- 
zen at 80°C for use. 

RNA preparation, lllumina RNA-sequencing and data 
processing 

Total RNA was extracted from leaves using RNAiso Plus 
(Takara) according to the manufactures protocol. RNA 
concentrations were measured using a Qubit Fluorometer, 
and integrity was confirmed via 2100 Bioanalyzer (Agilent 
Technologies). The DGE libraries were prepared using 
lllumina Gene Expression Sample Prep Kit. Single-chain 
molecules were fixed onto a Solexa Sequencing Chip 
(flowcell) and sequenced by lllumina HiSeq™ 2000 System. 
Millions of raw, 35 bp sequences were generated. Image 
analysis, base calling, generation of raw tags, and the tags 
were counted using the lllumina pipeline [38]. Empty tags 
(no tag sequence between the adaptors), adaptors, low 
quality tags (tags containing one or more unknown 
nucleotides "N")> and tags with a copy number of 1 were 
removed from raw sequences to obtain clean tags (21 bp) 
containing CATG. 

Mapping of reads to the reference sequence 

To identify the gene expression patterns in early genera- 
tions of synthesized B. napus, all clean tags were anno- 
tated by mapping to the sequenced genome of B. rapa 
[34] using the SOAP2 software, with a maximum of 1 
nucleotide mismatch allowed [46]. All the tags mapped 
to reference sequences were filtered and the remaining 
tags were designated as ambiguous tags. Mapping events 
on both sense and antisense sequences were included in 
the data processing. For gene expression analysis, the 
number of expressed tags was calculated and then nor- 
malized to TPM (number of transcripts per million tags) 
[47,48]; and the DEGs were screened and used for map- 
ping and annotation [49,50]. Gene annotation was con- 
ducted using Blast2GO [51]. The gene ontology (GO) 
categorization of all DEGs was displayed as three hier- 
archies for cellular component, molecular function, and 
biological process by searching in the GO database. Web 
Gene Ontology Annotation Plot (WEGO) was also used 
for GO classification of genes mapped by each DGE li- 
brary [52]. Clustering analysis of differential gene ex- 
pression pattern was also conducted using hierarchical 
clustering explorer (HCE) [53,54]. In this study, statis- 
tical analysis of DEGs among libraries was performed 
using stringent value FDR < 0.001 (false discovery rate) 
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and the absolute value of |log2Ratio| > 1 as the threshold 
of significant difference of gene expression. The DEGs in 
the hierarchical clustering were grouped into functional 
categories based on MIPS functional catalogue using 
Arabidopsis orthologues [55]. Pathway enrichment ana- 
lysis of differential gene expression was conducted for 
further understanding gene function through blasting 
the KEGG database. A P-value of 0.05 was selected as 
the threshold for considering a gene set as significantly 
enriched. 
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Additional file 9: S9. Histogram presentation of gene ontology 
classification of B. rapa (Br), B. olerocea (Bo), B. napus-V^ (Bn-F]), B. napus-F 2 
(Bn-F 2 ), B. napus-F 3 (Bn-F 3 ), B. napus-F 4 (Bn-F 4 ) and natural B. napus (Bn-N). 

Additional file 10: S10. List of pathway enrichment analysis. Pathways 
with Q value < 0.05 are significantly enriched in DEGs, see red-border 
region ('A' was the control and 'B' was experimental group in 'A vs. B'). 

Additional file 11: S1 1. Both union and intersection DEGs used for HCE 
clustering analysis. 
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